File processing method, data processing apparatus, and storage medium

ABSTRACT

A file processing method, a data processing apparatus and a storage medium divide data and index data with respect to the data into a plurality of sections, and compress the sections to obtain a compressed file, and store the compressed file in a storage medium together with address information of the sections after the compression.

TECHNICAL FIELD

The present invention generally relates to file processing methods, data processing apparatuses and storage mediums, and more particularly to a file processing method and a data processing apparatus which compress a file such as a dictionary file related to one or a plurality of dictionaries, encyclopedias and the like, store the compressed file in a storage medium and read the stored file from the storage medium, and to a storage medium which stores a file such as a compressed dictionary file.

Recently, there are storage mediums such as a CD-ROM which prestore information related to a dictionary, encyclopedia or the like. By making access to such a CD-ROM from a computer, it is possible to read and display the information related to the dictionary, encyclopedia or the like. As a result, a large amount of information related to the dictionary, encyclopedia or the like can be stored in a single CD-ROM which is extremely compact. In addition, instead of obtaining the necessary information by opening a dictionary, encyclopedia or the like while using a computer, the necessary information can be read from the CD-ROM, thereby making it possible to greatly reduce the time and trouble to obtain the necessary information.

BACKGROUND ART

In a conventional CD-ROM which stores the information related to the dictionary, encyclopedia or the like, a dictionary file is made up of a dictionary data and a data related to index (hereinafter referred to as an index data). For example, in the case of an encyclopedia, the dictionary data includes a data (hereinafter referred to as a text data) related to a text which explains the meaning of a word, a data (hereinafter referred to as an image data) related to an image showing an animal if the word describes the animal, for example, a data (hereinafter referred to as an audio data) related to a sound such as a singing of a bird if the word describes the bird, for example, and the like. On the other hand, the index is used to retrieve a desired dictionary data from the dictionary file, and is provided with respect to the dictionary data. The index is sometimes also referred to as a keyword. The index data includes a pointer related to a heading, a pointer related to an item, and the like. The data related to the heading includes a headword. Further, the data related to the item includes a headword, comment, and the like.

Conventionally, because the storage capacity of the CD-ROM is relatively large, the text data and the index data are stored in the CD-ROM without being compressed. On the other hand, the amount of information included in the audio data and particularly the image data is large, and the audio data and the image data are respectively compressed according to appropriate compression techniques before being stored in the CD-ROM.

However, if one CD-ROM is required for each dictionary or encyclopedia, it is troublesome to utilize the dictionary data. For this reason, it is desirable to store the information related to a plurality of dictionaries, encyclopedias or the like in a single CD-ROM, but in this case, there was a problem in that the amount of information to be stored may exceed the storage capacity of the single CD-ROM even if the dictionary data is compressed. In addition, even in a case where the dictionary file to be stored in the CD-ROM relates to a single dictionary, encyclopedia or the like, as the amount of information of the dictionary file increases, the amount of information to be stored may exceed the storage capacity of the single CD-ROM even when the dictionary data is compressed.

Accordingly, it is conceivable to not only compress the dictionary data but to compress the entire dictionary file, including the index data, when storing the information related to the dictionary, encyclopedia or the like in the CD-ROM. But no method which is capable of efficiently compressing the entire dictionary file by a relatively simple technique and capable of expanding the compressed dictionary file in a short time has yet been proposed. Particularly in the case of the dictionary, encyclopedia or the like, the amount of information related to the index data is large. For this reason, if it takes a long time to carry out the process of restoring the index data when expanding the compressed dictionary file, an access time to the desired index data or dictionary data becomes long, thereby deteriorating the convenience of the dictionary, encyclopedia or the like.

Moreover, when compressing the dictionary data in units of the item of the index or in units of a fixed length, for example, it takes a long time to carry out the process of expanding the dictionary file because the amount of information related to the index data is large particularly in the case of the dictionary, encyclopedia or the like, thereby similarly deteriorating convenience of the dictionary, encyclopedia or the like. For example, a Japanese Laid-Open Patent Application No. 9-26969 proposes a telephone directory retrieval system which employs a method similar to the above. However, this proposed method does not compress the index data. In the case of the telephone directory, the amount of information related to the index data is small compared to the amount of information related to the telephone number, family name, given name, corporate name and address which correspond to the dictionary data. Consequently, the information compression efficiency as a whole will not greatly improve even if the index data of the telephone directory were compressed. Therefore, even if this proposed method were applied to the storage of the information related to the dictionary, encyclopedia or the like into the storage medium, the information compression efficiency of the dictionary file as a whole will not improve considerably.

Accordingly, in a case where the amount of information related to the index data is relatively large even when compared to the amount of information related to the dictionary data, such as the case of the dictionary, encyclopedia or the like, there was a problem in that it is conventionally impossible to efficiently compress and store the dictionary file in the storage medium and to make access to the compressed dictionary file in a short time by a relatively simple process.

DISCLOSURE OF THE INVENTION

Hence, it is an object of the present invention to provide a file processing method, a data processing apparatus and a storage medium which are capable of efficiently compressing and storing a dictionary file in the storage medium and making access to the compressed dictionary file in a short time by a relatively simple process, even in a case where the amount of information related to an index data is large even when compared to the amount of information related to a dictionary data, such as the case of a dictionary, encyclopedia or the like.

Another object of the present invention is to provide a file processing method comprising a compressing step dividing data and index data with respect to the data into a plurality of sections, and compressing the sections to obtain a compressed file, and a storing step storing the compressed file in a storage medium together with address information of the sections after the compression. According to the present invention, it is possible to efficiently compress and store in the storage medium a file such as a dictionary file which is formed by data including an index, text of each item and the like. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.

When each section has a fixed length, it becomes unnecessary to include address information prior to the compression in the compressed file, and the data compression efficiency can be improved. On the other hand, when each section has a variable length, and said storing step further stores address information prior to the compression in the storage medium, it is possible to carry out the data expansion at a high speed by setting the section to an appropriate length depending on the data type and section.

When the file processing method further comprises a restoring step reading the compressed file from the storage medium and expanding each of the sections, so as to restore the data and the index data, it is possible to improve the file retrieval speed by using an auxiliary storage unit capable of making a high-speed data access and storing the restored data and index data in the auxiliary storage unit.

When the compressing step uses a compression algorithm and a compression parameter which are common to the data and the index data of each of the sections, it is possible to simplify both the data compression process and the data expansion process. More particularly, it is possible to use the Huffman code, the universal code and the like as the compression algorithm.

Still another object of the present invention is to provide a file processing method comprising a reading step reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and a restoring step expanding the compressed file and restoring the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file such as a compressed dictionary file for every section.

A further object of the present invention is to provide a data processing apparatus comprising compressing means for dividing data and index data with respect to the data into a plurality of sections, and compressing the sections to obtain a compressed file, and storing means for storing the compressed file in a storage medium together with address information of the sections after the compression. According to the present invention, it is possible to efficiently compress and store in the storage medium a file which is formed by data including an index, text of each item and the like. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.

Another object of the present invention is to provide a data processing apparatus comprising reading means for reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and restoring means for expanding the compressed file and restoring the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file for every section.

Still another object of the present invention is to provide a storage medium which stores computer-readable information, comprising reading means for causing a computer to read a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and restoring means for causing the computer to expand the compressed file and restore the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file for every section.

A further object of the present invention is to provide a storage medium which stores computer-readable information, comprising a compressed file stored together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, where said compressed file is compressed using a compression algorithm and a compression parameter which are common to the data and the index data of each of the sections. According to the present invention, it is possible to efficiently compress and store a file in the storage medium. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.

Another object of the present invention is to provide a storage medium which stores computer-readable information, including a program which causes a computer to carry out a compressing procedure for dividing dictionary data and index data with respect to the dictionary data into a plurality of sections, and compressing the sections to obtain a compressed dictionary file, and a storing procedure for storing the compressed dictionary file in the storage medium together with address information of the sections after the compression. According to the present invention, it is possible to retrieve the file at a high speed by carrying out a relatively simple process.

Still another object of the present invention is to provide a computer-readable storage medium storing a compressed file comprising a compressed data region storing compressed data obtained by dividing data and index data with respect to the data into a plurality of sections and compressing the sections, an address information region storing address information after compression of the sections, and a compression parameter region storing a compression parameter used for the compression. According to the present invention, it is possible to retrieve the file by carrying out a relatively simple process.

Therefore, according to the present invention, even when the amount of information related to the index data is large even when compared with the amount of information related to the dictionary data, such as the case of the dictionary, encyclopedia and the like, it is possible to efficiently compress and store the file such as the dictionary file in the storage medium, and the file such as the compressed dictionary file can be accessed within a short time by carrying out the relatively simple process.

Other objects and further features of the present invention will be apparent from the following detailed description when read in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a system block diagram showing the general construction of a computer system which employs an embodiment of a file processing method;

FIG. 2 is a flow chart for explaining a compression parameter computing process carried out by a CPU;

FIG. 3 is a diagram showing a data structure of the compression parameter;

FIG. 4 is a flow chart for explaining a data compression process and an address information computing process carried out by the CPU;

FIG. 5 is a flow chart for explaining a compressed file composing process and a compressed file storing process carried out by the CPU;

FIG. 6 is a diagram for explaining the composing of the compressed files;

FIG. 7 is a flow chart for explaining an index read process carried out by the CPU;

FIG. 8 is a flow chart for explaining a data expansion process carried out by the CPU; and

FIG. 9 is a flow chart for explaining a text read process carried out by the CPU.

BEST MODE OF CARRYING OUT THE INVENTION

A description will be given of an embodiment of a file processing method according to the present invention and an embodiment of a data processing apparatus according to the present invention. This embodiment of the file processing method and this embodiment of the data processing apparatus employ an embodiment of a storage medium according to the present invention. In this embodiment of the storage medium, the present invention is applied to a CD-ROM. However, the present invention is of course similarly applicable to optical information storage mediums other than the CD-ROM, magneto-optical storage mediums such as a magneto-optical disk, magnetic storage mediums such as a floppy disk, and various kinds of semiconductor memory devices.

FIG. 1 is a system block diagram showing the general construction of a computer system to which this embodiment of the file processing method is applied, and corresponds to this embodiment of the data processing apparatus. The computer system shown in FIG. 1 generally includes a central processing unit (CPU) 1, a main storage unit 2 made up of a random access memory (RAM) or the like, an auxiliary storage unit 3 made up of a hard disk drive or the like, an input device 4 made up of a keyboard, mouse or the like, a display unit 5, and a CD-ROM input/output device 6 made up of a CD-ROM drive or the like, which are coupled via a bus 9. Each element itself forming the computer system can be realized by an element having a known construction.

The input device 4 is used to input instructions and data to the CPU 1. The CPU 1 carries out a process requested by a user by executing a program stored in the auxiliary storage unit 3 based on the instructions and data which are input. The program stored in the auxiliary storage unit 3 may be pre-installed, or may be loaded from a CD-ROM 6a which is loaded into the CD-ROM input/output device 6. The main storage unit 2 is used to temporarily store intermediate results of computing processes or the like carried out by the CPU 1, data used by the operations, and the like. The display unit 5 displays a result of the process carried out by the CPU 1, and messages urging the user to input an instruction or data. It is possible to connect a printer (not shown) which prints the result of the process carried out by the CPU 1 or the like, in place of the display unit 5 or in addition to the display unit 5.

First, a description will be given of a file storing process which stores a dictionary file of a dictionary, encyclopedia or the like into the CD-ROM 6a which is loaded into the CD-ROM input/output device 6. The file storing process generally includes a compression parameter computing process, a data compression process for compressing an index, text and the like, an address information computing process, a compressed file composing process, and a compressed file storing process. In this embodiment, it is assumed for the sake of convenience that a program for causing the CPU 1 to carry out the file storing process is stored in the CD-ROM 6a, and that the CPU 1 reads this program from the CD-ROM 6a by a known method and loads this program into the auxiliary storage unit 3. In addition, it is assumed for the sake of convenience that the dictionary file related to the dictionary, encyclopedia or the like is either transferred from a host unit (not shown) and stored in the auxiliary storage unit 3 via the bus 9, or read from a CD-ROM different from the CD-ROM 6a by the CD-ROM input/output device 6 and stored in the auxiliary storage unit 3 via the bus 9.

1a) Compression Parameter Computing Process:

FIG. 2 is a flow chart for explaining the compression parameter computing process carried out by the CPU 1. In FIG. 2, a step S1 makes access to the auxiliary storage unit 3 and opens a dictionary file. A step S2 reads 1 character, that is, a 16-bit code, for example, from the dictionary file. A step S3 counts an appearance frequency of the read 16-bit code by use of an appearance frequency counter within the CPU 1. A step S4 decides whether or not a last character of the dictionary file is processed, and the process returns to the step S2 if the decision result in the step S4 is NO.

On the other hand, if the decision result in the step S4 is YES, a step S5 closes the dictionary file. A step S6 sorts the 16-bit codes depending on the order of the appearance frequency, and a step S7 selects 1024 16-bit codes, for example, depending on the order of the appearance frequency. A step S8 decomposes the remaining non-selected 16-bit codes into 8-bit codes, and calculates the appearance frequency of the 8-bit code. A step S9 corrects the appearance frequency of the 8-bit code with respect to the appearance frequency of the 16-bit code, by setting the appearance frequency of the 8-bit code to approximately ½.

A step S10 opens a compression parameter save file for the compression parameter in the auxiliary storage unit 3. A step S11 writes the 1024 16-bit codes and the appearance frequency thereof in the compression parameter save file. In addition, a step S12 writes 256 8-bit codes and the appearance frequency thereof in the compression parameter save file. A step S13 closes the compression parameter save file, and the process ends.
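The steps S2 through S9 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the embodiment's implementation: the function name `compute_compression_parameter`, the big-endian reading of the 16-bit codes, the zero padding of odd-length input, and the reading of the step S9 as a plain halving of the 8-bit frequencies are all assumptions made for the example.

```python
from collections import Counter


def compute_compression_parameter(data: bytes, top_n: int = 1024):
    """Return (selected 16-bit code frequencies, 8-bit code frequencies)."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad odd-length input with a zero byte

    # Steps S2 to S4: read one 16-bit code at a time and count appearances.
    freq16 = Counter(
        int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)
    )

    # Steps S6 and S7: order by appearance frequency and keep the top_n codes.
    selected = dict(freq16.most_common(top_n))

    # Step S8: decompose every non-selected 16-bit code into two 8-bit codes.
    freq8 = Counter()
    for code, n in freq16.items():
        if code not in selected:
            freq8[code >> 8] += n
            freq8[code & 0xFF] += n

    # Step S9 (assumed interpretation): scale the 8-bit frequencies by 1/2.
    table8 = {b: freq8.get(b, 0) / 2 for b in range(256)}
    return selected, table8
```

The two returned tables correspond to what the steps S11 and S12 write to the compression parameter save file; how they are serialized is left open here.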

FIG. 3 is a diagram showing the data structure of the compression parameter. As shown in FIG. 3, in the case of a compression using the Huffman code, the compression parameter includes the appearance frequency of each of the 1024 kinds of 16-bit codes, and the appearance frequency of each of the 256 kinds of 8-bit codes, for example. The appearance frequencies become data which are used to generate a Huffman tree. In the case of a compression using the universal code, the compression parameter includes a trie or data such as registered symbol examples and reference numbers thereof which are used to generate the trie.

1b) Data Compression Process and Address Information Computing Process:

FIG. 4 is a flow chart for explaining the data compression process and the address information computing process carried out by the CPU 1. In FIG. 4, a step S21 creates a conversion table, that is, a Huffman tree since this embodiment carries out a Huffman compression, based on the appearance frequencies of the 8-bit codes and the 16-bit codes. A step S22 opens the dictionary file within the auxiliary storage unit 3. A step S23 opens a compressed data save file for the compressed data and an address information save file for the address information, within the auxiliary storage unit 3.

A step S24 reads 1 section from the dictionary file. This section may have a fixed length or a variable length, but in this embodiment, it is assumed for the sake of convenience that this section has a fixed length. This section is sometimes also referred to as a block. A step S25 computes the compressed data of 1 section using the Huffman tree. A step S26 adds an end code to the end of 1 section. In addition, a step S27 writes the compressed data in the compressed data save file.

A step S28 computes the address information related to the address where the above described section is stored. For example, when the section has the fixed length, the address information is computed based on a section number which is assigned with respect to each section. A step S29 writes the address information in the address information save file. A step S30 decides whether or not a last section is processed, and the process returns to the step S24 if the decision result in the step S30 is NO. For example, it is possible to decide whether or not the last section is processed, based on a last section code which is added to the section number or the last section.

On the other hand, if the decision result in the step S30 is YES, a step S31 closes the save file for the compressed data and closes the address information save file. In addition, a step S32 closes the dictionary file, and the process ends.
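The loop of the steps S21 through S30 can be illustrated by the following Python sketch. It is a simplification under stated assumptions rather than the embodiment itself: the symbols are single bytes instead of the mixed 8-bit and 16-bit codes, the frequencies are taken directly from the input instead of the compression parameter save file, each compressed section is padded to a byte boundary, and the names `END`, `build_codes` and `compress_file` are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

END = 256  # hypothetical end-code symbol added after every section (step S26)


def build_codes(freq):
    """Step S21: derive a Huffman code {symbol: bit string} from frequencies."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def compress_file(data: bytes, section_size: int = 2048):
    """Steps S24 to S30: compress section by section, recording addresses."""
    freq = Counter(data)
    freq[END] = max(1, -(-len(data) // section_size))  # one end code per section
    codes = build_codes(freq)
    compressed, addresses = bytearray(), []
    for start in range(0, len(data), section_size):
        section = data[start:start + section_size]
        bits = "".join(codes[b] for b in section) + codes[END]
        bits += "0" * (-len(bits) % 8)  # assumption: byte-align each section
        addresses.append(len(compressed))  # address after compression (S28)
        compressed += int(bits, 2).to_bytes(len(bits) // 8, "big")
    return codes, bytes(compressed), addresses
```

Because the sections have a fixed length before compression, the position of an entry in `addresses` is the section number, so only the post-compression offsets need to be recorded.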

1c) Compressed File Composing Process and Compressed File Storing Process:

FIG. 5 is a flow chart for explaining the compressed file composing process and the compressed file storing process carried out by the CPU 1. In FIG. 5, a step S41 opens a compressed file within the auxiliary storage unit 3. A step S42 opens the compression parameter save file within the auxiliary storage unit 3, and a step S43 copies the compression parameter within the compression parameter save file to the compressed file. A step S44 closes the compression parameter save file.

A step S45 opens the address information save file within the auxiliary storage unit 3, and a step S46 copies the address information in the address information save file to the compressed file. A step S47 closes the address information save file. Furthermore, a step S48 opens the compressed data save file within the auxiliary storage unit 3, and a step S49 copies the compressed data in the compressed data save file to the compressed file. A step S50 closes the compressed data save file. A step S51 stores the compressed file in the CD-ROM 6a by the CD-ROM input/output device 6. In addition, a step S52 closes the compressed file, and the process ends.
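The composing order described above (compression parameter, then address information, then compressed data, with management information at the head) can be sketched in Python as follows. The byte layout is an assumption made for illustration: the text does not specify field widths, so the sketch prefixes a hypothetical header of four little-endian 32-bit region lengths, and the names `compose_compressed_file` and `split_compressed_file` are likewise hypothetical.

```python
import struct

HEADER = "<4I"  # assumed header: lengths of the four regions that follow


def compose_compressed_file(management: bytes, parameter: bytes,
                            addresses, compressed: bytes) -> bytes:
    """Concatenate the regions in the order of the steps S43, S46 and S49."""
    address_region = struct.pack(f"<{len(addresses)}I", *addresses)
    header = struct.pack(HEADER, len(management), len(parameter),
                         len(address_region), len(compressed))
    return header + management + parameter + address_region + compressed


def split_compressed_file(blob: bytes):
    """Inverse of compose_compressed_file, as used when reading the file back."""
    sizes = struct.unpack_from(HEADER, blob, 0)
    regions, pos = [], struct.calcsize(HEADER)
    for size in sizes:
        regions.append(blob[pos:pos + size])
        pos += size
    management, parameter, address_region, compressed = regions
    addresses = list(
        struct.unpack(f"<{len(address_region) // 4}I", address_region)
    )
    return management, parameter, addresses, compressed
```

Keeping the address information ahead of the compressed data lets a reader locate any section after reading only the head of the file.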

FIG. 6 is a diagram for explaining the composing of the compressed file with reference to 1a) the compression parameter computing process, 1b) the data compression process and the address information computing process, and 1c) the compressed file composing process and the compressed file storing process described above. In FIG. 6, (a) shows the compression parameter. In this embodiment, the compression parameter is used to carry out the compression using the Huffman code. In FIG. 6, (b) shows the sections of the dictionary file. In this embodiment, each section is made up of 2 kbytes, for example, and each section is made up of a dictionary data and an index data. In the case of an encyclopedia, for example, the dictionary data includes a text data related to a text which explains the meaning of a word, an image data related to an image showing an animal if the word describes the animal, for example, an audio data related to a sound such as a singing of a bird if the word describes the bird, for example, and the like. On the other hand, the index is used to retrieve a desired dictionary data from the dictionary file, and is provided with respect to the dictionary data. The index is sometimes also referred to as a keyword. The index data includes a pointer related to a heading, a pointer related to an item, and the like. The data related to the heading includes a headword. Further, the data related to the item includes a headword, comment, and the like.

In FIG. 6, (c) shows the compressed data, in a state where each section has a fixed length or a variable length and is compressed. Furthermore, in FIG. 6, (d) shows the address information computed with respect to each section, and (e) shows the compressed file which is obtained by composing the address information and the compressed data and adding management information at a head of the compressed file. The management information includes information used when retrieving the compressed file, such as a dictionary file name, a dictionary file type, and a type of compression used for the dictionary file.

Next, a description will be given of a file retrieval process which retrieves a desired data by reading a compressed file which is stored in the CD-ROM 6a which is loaded into the CD-ROM input/output device 6. The file retrieval process generally includes an index read process and a text read process, and is carried out by calling a data expansion process. In this embodiment, it is assumed for the sake of convenience that a program for causing the CPU 1 to carry out the file retrieval process is stored in the CD-ROM 6a, and that the CPU 1 reads this program and the compressed file from the CD-ROM 6a by a known method and loads the read program and compressed file into the auxiliary storage unit 3.

2a) Index Read Process:

FIG. 7 is a flow chart for explaining the index read process carried out by the CPU 1. In FIG. 7, a step S61 sets address information of a most significant index, based on the index data input by the user via the input device 4. A step S62 calls the expansion process, and reads a routine for carrying out the expansion process from the program which is stored in the auxiliary storage unit 3 and causes the CPU 1 to carry out the file retrieval process, so as to expand the address of the most significant index within the compressed file. A step S63 acquires the address of a significant index, that is, the head character of the most significant index, based on the index data. A step S64 calls the expansion process, and expands the address of the significant index within the compressed file. A step S65 acquires the address of a less significant index in a next hierarchical layer, based on the index data. A step S66 calls the expansion process, and expands the address of the less significant index in the next hierarchical layer described above within the compressed file. A step S67 decides whether or not the expansion of the address of a least significant index has ended, and the process returns to the step S65 if the decision result in the step S67 is NO. On the other hand, the process ends if the decision result in the step S67 is YES.

2b) Data Expansion Process:

FIG. 8 is a flow chart for explaining the data expansion process carried out by the CPU 1. The data expansion process is called by the index read process and the text read process. In FIG. 8, a step S71 stores the requested expansion address, data size and storage region in the auxiliary storage unit 3, based on the index data which is input by the user via the input device 4, so as to prepare a sufficiently large storage region within the auxiliary storage unit 3 with respect to the expanded data size. A step S72 decides whether or not the compressed file which is read from the CD-ROM 6a and loaded into the auxiliary storage unit 3 is open. If the decision result in the step S72 is NO, a step S73 opens the compressed file within the auxiliary storage unit 3. A step S74 reads the compression parameter from the compressed file, and reads the appearance frequency of the 8-bit code within the compression parameter, the 16-bit code within the compression parameter, and the appearance frequency of the 16-bit code. A step S75 creates a Huffman tree based on the appearance frequency of the 8-bit code and the appearance frequency of the 16-bit code, and the process advances to a step S76 which will be described later. A judging flag for judging whether the code is the 8-bit code or the 16-bit code is added to the data of the leaf of the Huffman tree.

If the decision result in the step S72 is YES or after the step S75, the step S76 reads the address information corresponding to the requested expansion address, from the compressed file. A step S77 reads the section of the corresponding compressed data from the compressed file, based on the address information. A step S78 expands the section of the compressed data by use of the Huffman tree. A step S79 copies the expanded data to the storage region described above, based on the judging flag which indicates whether the code is the 8-bit code or the 16-bit code. Further, a step S80 decides whether or not the expansion of the requested data size is completed with respect to the compressed file.

If the decision result in the step S80 is NO, a step S81 reads the address information corresponding to the expansion address of the next section, from the compressed file. A step S82 reads the section of the corresponding compressed data from the compressed file, based on the address information corresponding to the expansion address of this next section, and the process returns to the step S78. On the other hand, the process ends if the decision result in the step S80 is YES.
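The section-by-section expansion of the steps S76 through S82 can be sketched in Python as follows. To stay short and self-contained, the sketch assumes a small fixed prefix code (`CODES`) in place of the Huffman tree rebuilt in the step S75, character symbols in place of the 8-bit and 16-bit codes, and byte-aligned compressed sections; the names `END`, `build_blob`, `expand_section` and `expand_range` are hypothetical.

```python
END = "<end>"  # hypothetical end-code marker closing every section

# Toy prefix code standing in for the Huffman tree of the step S75.
CODES = {"a": "0", "b": "10", "c": "110", END: "111"}


def encode_section(text: str) -> bytes:
    """Helper for the example only: compress one section with CODES."""
    bits = "".join(CODES[ch] for ch in text) + CODES[END]
    bits += "0" * (-len(bits) % 8)  # byte-align the section
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def build_blob(sections):
    """Helper for the example only: compressed data plus address information."""
    blob, addresses = bytearray(), []
    for text in sections:
        addresses.append(len(blob))
        blob += encode_section(text)
    return bytes(blob), addresses


def expand_section(blob: bytes, offset: int, decode_table) -> str:
    """Steps S77 and S78: decode from offset until the end code is reached."""
    out, current, pos = [], "", offset
    while True:
        byte = blob[pos]
        pos += 1
        for shift in range(7, -1, -1):
            current += str((byte >> shift) & 1)
            if current in decode_table:
                symbol = decode_table[current]
                if symbol == END:
                    return "".join(out)
                out.append(symbol)
                current = ""


def expand_range(blob: bytes, addresses, first_section: int, n_sections: int) -> str:
    """Steps S76 to S82: expand only the requested run of sections."""
    decode_table = {bits: symbol for symbol, bits in CODES.items()}
    return "".join(
        expand_section(blob, addresses[i], decode_table)
        for i in range(first_section, first_section + n_sections)
    )
```

Only the sections named in the request are decoded, which is the reason the address information is stored alongside the compressed data.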

2c) Text Read Process:

FIG. 9 is a flow chart for explaining the text read process carried out by the CPU 1. In FIG. 9, a step S91 counts the items matching the index, within the expanded data, based on the index data input by the user via the input device 4. A step S92 sets a value of an item pointer of the index to the address, based on the input index data. A step S93 calls the expansion process, and reads a routine for carrying out the expansion process from the program which is stored within the auxiliary storage unit 3 and causes the CPU 1 to carry out the file retrieval process, so as to expand the text indicated by the item pointer within the compressed file, that is, to expand the dictionary data amounting to 1 section.

A step S94 decides whether or not the dictionary data indicated by the item pointer has ended. If the decision result in the step S94 is NO, a step S95 sets the address of a next 1 section. In addition, a step S96 calls the expansion process, and expands the dictionary data amounting to this next 1 section indicated by the item pointer within the compressed file, and the process returns to the step S94. On the other hand, if the decision result in the step S94 is YES, a step S97 decides whether or not the process with respect to all of the items has ended, based on the input index data. The process returns to the step S92 if the decision result in the step S97 is NO. On the other hand, if the decision result in the step S97 is YES, a step S98 displays on the display unit 5 the dictionary data which is expanded for all of the items, and the process ends.

It is possible to carry out the step S98 before the step S97. In this case, the step S98 displays the dictionary data which is expanded for each item on the display unit 5 each time the dictionary data is expanded for each item.
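The item loop of the steps S92 through S97 may be sketched as follows. The description does not state how the end of the dictionary data of an item is detected in the step S94, so a terminator byte is assumed here purely for illustration. The function `read_text` and the callable `expand_section`, which returns the expanded bytes of one section, are likewise hypothetical names.

```python
from typing import Callable, Iterable, List


def read_text(item_pointers: Iterable[int],
              expand_section: Callable[[int], bytes],
              section_size: int,
              terminator: bytes = b"\x00") -> List[bytes]:
    """Expand, one section at a time, the text of every matching item."""
    results = []
    for pointer in item_pointers:                      # S92, repeated via S97
        section = pointer // section_size
        # S93: expand the section which holds the start of the item
        data = expand_section(section)[pointer % section_size:]
        while terminator not in data:                  # S94: has the item ended?
            section += 1                               # S95: address of next section
            chunk = expand_section(section)            # S96: expand next section
            if not chunk:                              # no more sections available
                break
            data += chunk
        results.append(data.split(terminator, 1)[0])
    return results                                     # S98 would display these
```

Returning the list at the end corresponds to displaying all of the items after the step S97. Yielding each item inside the loop instead would correspond to carrying out the step S98 before the step S97.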

In the embodiment described above, it is assumed for the sake of convenience that the section has the fixed length. In this case, the data compression efficiency is satisfactory, and it is possible to restore the address information from the compressed file without the need to store the address information prior to the compression of the section in the compressed file. This is because the section has the fixed length, and the section number is added to each section, thereby making it possible to calculate a relative position of each section with respect to another section.
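With the fixed length, the sections which hold a requested range follow from integer arithmetic alone, as in the sketch below. The function name `fixed_sections` and the sample section length are assumptions for illustration.

```python
def fixed_sections(address: int, size: int, section_size: int) -> range:
    """Section numbers covering [address, address + size) before compression."""
    first = address // section_size
    last = (address + size - 1) // section_size
    return range(first, last + 1)


# Section n starts at the pre-compression address n * section_size,
# so no table of pre-compression addresses has to be stored.
print(list(fixed_sections(3000, 5000, 2048)))
```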

On the other hand, when the section has a variable length, it is possible to improve the data expansion rate. This is because the length of the section can be set appropriately depending on the kind of data and section, thereby eliminating the need to expand excess data. In this case where the section has the variable length, it is necessary to store the address prior to the compression of the section in the compressed file. Accordingly, it is possible to make the section have the fixed length or the variable length, depending on whether the priority is to be given to the data compression efficiency or the data expansion rate.
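With the variable length, the section holding a given address can no longer be computed by division, which is why the addresses prior to the compression must be stored. A minimal sketch of the lookup, assuming the stored addresses are kept as a sorted list of section start addresses (the name `variable_sections` and the sample values are hypothetical):

```python
from bisect import bisect_right


def variable_sections(starts: list, address: int, size: int) -> range:
    """Section numbers covering [address, address + size).

    `starts` holds the pre-compression start address of each section,
    in ascending order, with starts[0] == 0.
    """
    first = bisect_right(starts, address) - 1
    last = bisect_right(starts, address + size - 1) - 1
    return range(first, last + 1)


starts = [0, 500, 4000, 4100, 9000]   # sample section boundaries
print(list(variable_sections(starts, 3000, 1050)))
```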

In addition, one or more dictionary files may be stored in the CD-ROM 6a. When a plurality of dictionary files related to a plurality of dictionaries, encyclopedias and the like are stored in the CD-ROM 6a, it is possible to specify the dictionary which is to be retrieved, using the dictionary file name or the dictionary file type within the management information shown in (e) of FIG. 6.

Furthermore, although the embodiment described above employs the Huffman code for the data compression, it is of course possible to use coding techniques other than the technique using the Huffman code, such as the technique using the universal code, as long as the employed data compression technique is capable of efficiently compressing the dictionary data using a common compression parameter for each of the sections. In addition, the data to be subjected to the data compression and expansion is not limited to the dictionary data, and includes data of a database including the index and data.

Moreover, in the embodiment described above, the file retrieval process is carried out by copying the program for carrying out the file retrieval process and the compressed file to the auxiliary storage unit 3. However, instead of copying the program and the compressed file to the auxiliary storage unit 3, it is possible to develop the program and the compressed file in the main storage unit 2, and carry out a process similarly to that described above.

By employing the compression algorithm of the above described embodiment, it is possible to improve the data compression efficiency compared to the normal data compression process using the Huffman code with 8 bits. As a result, it is possible to reduce the region of the compressed file stored in the storage medium such as the CD-ROM and the hard disk which is used as the auxiliary storage unit. Although the data compression efficiency is improved by this compression algorithm, the processing time required to expand the compressed file remains essentially unchanged from the processing time required to expand the compressed file compressed by the normal data compression process using the Huffman code.
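The idea of mixing the 8-bit code and the 16-bit code in one Huffman tree may be sketched as follows. This is an illustration under stated assumptions and not the embodiment itself: the 16-bit codes are taken at even byte offsets, the most frequent `keep` of them are retained, the remaining ones are decomposed into 8-bit codes, and codewords are represented as strings of "0" and "1" for readability. Each symbol is a pair of the judging flag (True for the 16-bit code) and its bytes. All function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def tokenize(data, selected):
    """Yield (is_16bit, bytes): selected pairs stay whole, others are split."""
    for i in range(0, len(data), 2):
        pair = data[i:i + 2]
        if pair in selected:
            yield (True, pair)
        else:
            for b in pair:
                yield (False, bytes([b]))


def build_table(data, keep):
    """Select frequent 16-bit codes, then build one Huffman table for all symbols."""
    pairs = [data[i:i + 2] for i in range(0, len(data) - 1, 2)]
    selected = {p for p, _ in Counter(pairs).most_common(keep)}
    freq = Counter(tokenize(data, selected))
    tie = count()  # tie-breaker so that the heap never compares dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)
        n2, _, b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    table = heap[0][2] if heap else {}
    if len(table) == 1:  # a lone symbol still needs a one-bit codeword
        table = {sym: "0" for sym in table}
    return selected, table


def encode(data, selected, table):
    return "".join(table[sym] for sym in tokenize(data, selected))


def decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, cur = bytearray(), ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out += inverse[cur][1]  # copies 1 or 2 bytes depending on the symbol
            cur = ""
    return bytes(out)
```

Because the frequent 16-bit codes are each coded as a single symbol, text with many repeated two-byte units needs fewer codewords than a purely 8-bit Huffman code would, while the decoding loop is the same tree walk in both cases.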

A time required to carry out the file retrieval process is made up of a seek time of the read unit (drive), a read time required to read the compressed file, and a time required to carry out the expansion process.

Since the data compression efficiency is improved by the compression algorithm described above, the reduced region of the compressed file stored in the storage medium enables reduction of the seek time of the file retrieval process. Consequently, the file retrieval speed is improved. This effect of improving the file retrieval speed becomes more notable as the hardware performance improves.

Further, the present invention is not limited to these embodiments, but various variations and modifications may be made without departing from the scope of the present invention.

What is claimed is:
 1. A file processing method comprising: dividing data and index data different from and corresponding to the data into a plurality of sections, the index data being used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data; compressing the sections to obtain a compressed file; and storing the compressed file in a storage medium together with address information of the sections after compression.
 2. The file processing method as claimed in claim 1, wherein each section has a fixed length.
 3. The file processing method as claimed in claim 1, wherein each section has a variable length, and said storing further stores address information prior to the compression in the storage medium.
 4. The file processing method as claimed in claim 1, which further comprises: restoring by reading the compressed file from the storage medium and expanding each of the sections, so as to restore both the data and the index data.
 5. The file processing method as claimed in claim 4, which further comprises: storing the restored data and the restored index data in an auxiliary storage unit.
 6. The file processing method as claimed in claim 1, wherein said compressing uses a compression algorithm and a compression parameter which are common to both the data and the index data of each of the sections.
 7. The file processing method as claimed in claim 1, wherein said compressing selects a predetermined number of first bit codes within the data depending on an order of an appearance frequency thereof, decomposes remaining non-selected first bit codes into second bit codes, creates a conversion table based on a result of selecting the second bit codes depending on an order of an appearance frequency thereof, and carries out a data compression based on the conversion table.
 8. The file processing method as claimed in claim 1, wherein the data includes dictionary data.
 9. A file processing method comprising: reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data different from and corresponding to the data into the sections, the index data being used in searching or retrieving the data and each of the sections including both data and index data, and compressing the sections, the data including at least one element selected from a group consisting of text data, image data and audio data; expanding the compressed file; and restoring both the data and the index data.
 10. The file processing method as claimed in claim 9, which further comprises: storing both the restored data and the restored index data in an auxiliary storage unit.
 11. The file processing method as claimed in claim 9, wherein said expanding carries out a data expansion based on a conversion table which is obtained at a time of the compression by selecting a predetermined number of first bit codes within the data depending on an order of an appearance frequency thereof, decomposes remaining non-selected first bit codes into second bit codes, and creates the conversion table based on a result of selecting the second bit codes depending on an order of an appearance frequency thereof.
 12. The file processing method as claimed in claim 9, wherein each section has a fixed length.
 13. The file processing method as claimed in claim 9, wherein each section has a variable length, and address information prior to the compression is stored in the storage medium.
 14. The file processing method as claimed in claim 9, wherein the data includes dictionary data.
 15. A data processing apparatus comprising: a compression unit to divide data and index data different from and corresponding to the data into a plurality of sections, the index data being used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data, and to compress the sections to obtain a compressed file; and a storing unit to store the compressed file in a storage medium together with address information of the sections after compression.
 16. The data processing apparatus as claimed in claim 15, wherein each section has a fixed length.
 17. The data processing apparatus as claimed in claim 15, wherein each section has a variable length, and said storing unit stores address information prior to the compression in the storage medium.
 18. The data processing apparatus as claimed in claim 15, which further comprises: a restoring unit to read the compressed file from the storage medium and expanding each of the sections, so as to restore both the data and the index data.
 19. The data processing apparatus as claimed in claim 18, which further comprises: an auxiliary storage unit to store both the restored data and the restored index data.
 20. The data processing apparatus as claimed in claim 15, wherein said compression unit uses a compression algorithm and a compression parameter which are common to both the data and the index data of each of the sections.
 21. The data processing apparatus as claimed in claim 15, wherein said compression unit selects a predetermined number of first bit codes within the data depending on an order of an appearance frequency thereof, decomposes remaining non-selected first bit codes into second bit codes, creates a conversion table based on a result of selecting the second bit codes depending on an order of an appearance frequency thereof, and carries out data compression based on the conversion table.
 22. The data processing apparatus as claimed in claim 15, wherein the data includes dictionary data.
 23. A data processing apparatus comprising: a reading unit to read a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data different from and corresponding to the data into the section, the index data being used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data, and compressing the sections; and a restoring unit to expand the compressed file and to restore both the data and the index data.
 24. The data processing apparatus as claimed in claim 23, which further comprises: an auxiliary storage unit to store both the restored data and the restored index data.
 25. The data processing apparatus as claimed in claim 23, wherein said restoring unit carries out data expansion based on a conversion table which is obtained at a time of the compression by selecting a predetermined number of first bit codes within the data depending on an order of an appearance frequency thereof, decomposes remaining non-selected first bit codes into second bit codes, and creates the conversion table based on a result of selecting the second bit codes depending on an order of an appearance frequency thereof.
 26. The data processing apparatus as claimed in claim 23, wherein each section has a fixed length.
 27. The data processing apparatus as claimed in claim 23, wherein each section has a variable length, and address information prior to the compression is further stored in the storage medium.
 28. The data processing apparatus as claimed in claim 23, wherein the data includes dictionary data.
 29. A storage medium which stores computer-readable information causing a computer to read and restore a compressed file by: reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data different from and corresponding to the data into the sections, the index data being used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data, and compressing the sections; expanding the compressed file; and restoring both the data and the index data.
 30. The storage medium as claimed in claim 29, which further comprises: storing both the restored data and the restored index data.
 31. The storage medium as claimed in claim 29, wherein each section has a fixed length.
 32. The storage medium as claimed in claim 29, wherein each section has a variable length, and said reading a compressed file includes reading address information prior to the compression from the storage medium.
 33. The storage medium as claimed in claim 29, wherein the compressed file is compressed using a compression algorithm and a compression parameter which are common to both the data and the index data of each of the sections.
 34. The storage medium as claimed in claim 29, wherein the data includes dictionary data.
 35. A storage medium which stores computer-readable information by: dividing data and index data different from and corresponding to the data into sections, the index data being used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data; compressing each of the sections to obtain a compressed file by using a compression algorithm and a compression parameter which are common to both the data and the index data of each of the sections; and storing the compressed file together with address information of each of a plurality of the sections after compression.
 36. The storage medium as claimed in claim 35, wherein each section has a fixed length.
 37. The storage medium as claimed in claim 35, wherein each section has a variable length, and further storing address information prior to the compression.
 38. The storage medium as claimed in claim 35, wherein the data includes dictionary data.
 39. A storage medium which stores computer-readable information, including a program which causes a computer to carry out: a compressing procedure dividing dictionary data and index data different from and corresponding to the dictionary data into a plurality of sections, the index data being used in searching or retrieving the dictionary data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data, and compressing the sections to obtain a compressed dictionary file; and a storing procedure storing the compressed dictionary file in the storage medium together with address information of the sections after compression.
 40. The storage medium as claimed in claim 39, wherein each section has a fixed length.
 41. The storage medium as claimed in claim 39, wherein each section has a variable length, and said storing procedure further stores address information prior to the compression in the storage medium.
 42. The storage medium as claimed in claim 39, wherein the compressed dictionary file is compressed using a compression algorithm and a compression parameter which are common to both the dictionary data and the index data for each of the sections.
 43. The storage medium as claimed in claim 39, which further stores a program for causing the computer to carry out: a procedure reading the compressed dictionary file from the storage medium for each of the sections and expanding the compressed dictionary file, so as to restore both the dictionary data and the index data.
 44. The storage medium as claimed in claim 43, which further stores a program for causing the computer to carry out: a procedure storing both the restored dictionary data and the restored index data in an auxiliary storage unit.
 45. The storage medium as claimed in claim 39, which further stores a program for causing the computer to carry out: a procedure selecting a predetermined number of first bit codes within the dictionary data depending on an order of an appearance frequency thereof, decomposing remaining non-selected first bit codes into second bit codes, creating a conversion table based on a result of selecting the second bit codes depending on an order of an appearance frequency thereof, and carrying out a data compression based on the conversion table.
 46. A computer-readable storage medium storing a compressed file comprising: a compressed data region storing compressed data obtained by dividing data and index data different from and corresponding to the data into a plurality of sections, the index data used in searching or retrieving the data and each of the sections including both data and index data, the data including at least one element selected from a group consisting of text data, image data and audio data, and compressing the sections; an address information region storing address information after compression of the sections; and a compression parameter region storing a compression parameter used for the compression.
 47. The computer-readable storage medium as claimed in claim 46, wherein said compression parameter includes a predetermined number of first bit codes within the data selected depending on an order of an appearance frequency thereof and the appearance frequency of the first bit codes, and second bit codes obtained by decomposing remaining non-selected first bit codes depending on an order of an appearance frequency thereof and the appearance frequency of the second bit codes.